<!DOCTYPE html>
<html lang="zh">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <title>SVM 公式推导 | Dirac Lee</title><meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="robots" content="noodp" />
<meta name="Description" content="Dirac Lee 的博客"><link rel="prev" href="https://diraclee.gitee.io/2020/10/sklearn-%E5%B8%B8%E7%94%A8%E6%A8%A1%E5%9E%8B%E5%8F%82%E6%95%B0/" /><link rel="next" href="https://diraclee.gitee.io/2020/12/%E6%8E%A8%E8%8D%90%E7%B3%BB%E7%BB%9F%E7%BB%BC%E8%BF%B0/" /><link rel="canonical" href="https://diraclee.gitee.io/2020/11/svm/" />
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="manifest" href="/site.webmanifest">
<link rel="mask-icon" href="/safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff"><meta property="og:title" content="SVM 公式推导" />
<meta property="og:description" content="" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://diraclee.gitee.io/2020/11/svm/" />
<meta property="article:published_time" content="2020-11-20T00:00:00+00:00" />
<meta property="article:modified_time" content="2020-11-20T00:00:00+00:00" />
<script type="application/ld+json">
    {
        "@context": "http://schema.org",
        "@type": "BlogPosting",
        "headline": "SVM 公式推导",
        "mainEntityOfPage": {
            "@type": "WebPage",
            "@id": "https:\/\/diraclee.gitee.io\/2020\/11\/svm\/"
        },"image": {
                "@type": "ImageObject",
                "url": "https:\/\/diraclee.gitee.io\/logo.png",
                "width":  800 ,
                "height":  600 
            },"genre": "posts","keywords": "机器学习, SVM","wordcount":  2993 ,
        "url": "https:\/\/diraclee.gitee.io\/2020\/11\/svm\/","datePublished": "2020-11-20T00:00:00\x2b00:00","dateModified": "2020-11-20T00:00:00\x2b00:00","license": "This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.","publisher": {
                "@type": "Organization",
                "name": "xxxx",
                "logo": {
                "@type": "ImageObject",
                "url": "https:\/\/diraclee.gitee.io\/logo.png",
                "width":  127 ,
                "height":  40 
                }
            },"description": ""
    }
    </script><link rel="stylesheet" href="/css/style.min.css"><link rel="stylesheet" href="/css/lib/fontawesome-free/all.min.css"><link rel="stylesheet" href="/css/lib/animate/animate.min.css"></head>
    <body><script>
            window.isDark = (window.localStorage && window.localStorage.getItem('theme')) === 'dark';
            window.isDark && document.body.classList.add('dark-theme');
        </script><div class="wrapper"><nav class="navbar">
    <div class="navbar-container">
        <div class="navbar-header animated bounceIn">
            <a href="https://diraclee.gitee.io">Dirac Lee</a>
        </div>
        <div class="navbar-menu"><a class="menu-item" href="https://diraclee.gitee.io/posts" title="">文章</a><a class="menu-item" href="https://diraclee.gitee.io/tags" title="">标签</a><a class="menu-item" href="https://diraclee.gitee.io/categories" title="">分类</a><a class="menu-item" href="https://diraclee.gitee.io/about" title="">关于</a><a class="menu-item" href="https://diraclee.gitee.io" title="English"><i class="fas fa-language fa-fw"></i></a><a href="javascript:void(0);" class="theme-switch"><i class="fas fa-adjust fa-rotate-180 fa-fw" title="切换主题"></i></a>
        </div>
    </div>
</nav><nav class="navbar-mobile">
    <div class="navbar-container">
        <div class="navbar-header">
            <div class="navbar-header-title animated bounceIn">
                <a href="https://diraclee.gitee.io">Dirac Lee</a>
            </div>
            <div class="menu-toggle" id="menu-toggle">
                <span></span><span></span><span></span>
            </div>
        </div>
        <div class="navbar-menu" id="mobile-menu"><a class="menu-item" href="https://diraclee.gitee.io/posts" title="">文章</a><a class="menu-item" href="https://diraclee.gitee.io/tags" title="">标签</a><a class="menu-item" href="https://diraclee.gitee.io/categories" title="">分类</a><a class="menu-item" href="https://diraclee.gitee.io/about" title="">关于</a><a class="menu-item" href="https://diraclee.gitee.io" title="English"></a><a href="javascript:void(0);" class="theme-switch"><i class="fas fa-adjust fa-rotate-180 fa-fw" title="切换主题"></i></a>
        </div>
    </div>
</nav><main class="main">
                <div class="container"><article class="page"><h1 class="post-title animated flipInX">SVM 公式推导</h1><div class="post-meta">
            <div class="post-meta-main"><a class="author" href="https://diraclee.gitee.io" rel="author" target="_blank">
                    <i class="fas fa-user-circle fa-fw"></i>Dirac Lee
                </a>&nbsp;<span class="post-category">收录于&nbsp;<i class="far fa-folder fa-fw"></i><a href="https://diraclee.gitee.io/categories/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/">机器学习</a>&nbsp;</span></div>
            <div class="post-meta-other"><i class="far fa-calendar-alt fa-fw"></i><time datetime=2020-11-20>2020-11-20</time>&nbsp;
                <i class="fas fa-pencil-alt fa-fw"></i>约 2993 字&nbsp;
                <i class="far fa-clock fa-fw"></i>预计阅读 6 分钟&nbsp;</div>
        </div><div class="post-content"><a class="post-dummy-target" id="线性可分svm"></a><h2>线性可分SVM</h2>
<a class="post-dummy-target" id="算法"></a><h3>算法</h3>
<a class="post-dummy-target" id="输入"></a><h4>输入</h4>
<p>线性可分训练集 $ T = \lbrace (x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(N)}, y^{(N)}) \rbrace $ ，其中 $x^{(i)} \in R^n$，$y^{(i)} \in \lbrace -1, +1 \rbrace$，$i = 1, 2, \dots, N$</p>
<a class="post-dummy-target" id="输出"></a><h4>输出</h4>
<p>分离超平面和分类决策函数</p>
<a class="post-dummy-target" id="步骤"></a><h4>步骤</h4>
<ol>
<li>
<p>Construct and solve the constrained optimization problem
$$
\begin{aligned}
&amp;\min_{\alpha} \frac 1 2 \sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)}) - \sum_{i=1}^N \alpha_i  \\ 
&amp;s.t. \quad
\begin{cases}
\sum_{i=1}^N \alpha_i y^{(i)} = 0 \\ 
\alpha_i \ge 0, \quad i = 1, 2, \cdots, N
\end{cases}
\end{aligned}
$$</p>
</li>
<li>
<p>Obtain the optimal solution $\alpha^\star = \begin{bmatrix} \alpha_1^\star \\ \alpha_2^\star \\ \vdots \\ \alpha_N^\star \end{bmatrix}  $</p>
</li>
<li>
<p>Compute
$$
w^\star = \sum_{i=1}^N \alpha_i^\star y^{(i)} x^{(i)}
$$
then choose a positive component $\alpha_i^\star &gt; 0$ of $\alpha^\star$ and compute
$$
b^\star = y^{(i)} -  \sum_{j = 1}^N \alpha_j^\star y^{(j)} (x^{(j)} \cdot x^{(i)})  \qquad s.t. \quad \alpha_i^\star &gt; 0
$$</p>
</li>
<li>
<p>This yields the separating hyperplane
$$
w^\star \cdot x + b^\star = 0
$$
and the classification decision function
$$
f(x) = sign(w^\star \cdot x + b^\star)
$$</p>
</li>
</ol>
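<p>The steps above can be sketched numerically. The following NumPy snippet uses a hypothetical two-point training set chosen so that the equality constraint makes the dual one-dimensional and a simple grid search suffices; the dataset, grid, and variable names are illustrative assumptions, not part of the original derivation.</p>

```python
import numpy as np

# Hypothetical two-point training set (illustrative only).
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])

# Step 1: dual objective (1/2) a^T Q a - sum(a), with Q_ij = y_i y_j (x_i . x_j).
Q = np.outer(y, y) * (X @ X.T)

def dual_objective(a):
    return 0.5 * a @ Q @ a - a.sum()

# The constraint sum_i a_i y_i = 0 forces a_1 = a_2 here, so scan one variable.
grid = np.linspace(0.0, 1.0, 10001)
a_star = grid[int(np.argmin([dual_objective(np.array([a, a])) for a in grid]))]
alpha = np.array([a_star, a_star])          # optimal multipliers

# Step 3: recover w* and b* from any positive component alpha_i > 0.
w = (alpha * y) @ X
i = int(np.argmax(alpha))                   # an index with alpha_i > 0
b = y[i] - (alpha * y) @ (X @ X[i])

# Step 4: decision function f(x) = sign(w . x + b).
def f(x):
    return np.sign(w @ x + b)

print(a_star, w, b, f(np.array([2.0, 0.0])))
```

<p>For this symmetric toy set the grid search recovers $\alpha^\star = (0.25, 0.25)$, $w^\star = (0.5, 0.5)$ and $b^\star = 0$, and both points are support vectors.</p>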
<a class="post-dummy-target" id="推导"></a><h3>推导</h3>
<a class="post-dummy-target" id="函数间隔与几何间隔"></a><h4>函数间隔与几何间隔</h4>
<p>样本点 $(x^{(i)}, y^{(i)})$ 的函数间隔
$$
\hat \gamma^{(i)}  = y^{(i)} (w^T x^{(i)} + b)
$$</p>
<p>The functional margin of the training set $T = \lbrace (x^{(i)}, y^{(i)}) | i = 1, 2, \dots, N  \rbrace$ is</p>
<p>$$
\hat \gamma = \min_{i = 1, \dots, N} \hat \gamma^{(i)}
$$</p>
<p>The geometric margin of a sample point $(x^{(i)}, y^{(i)})$ is</p>
<p>$$
\gamma^{(i)}  = \frac {y^{(i)} (w^T x^{(i)} + b)} {|| w ||}
$$</p>
<p>The geometric margin of the training set $T = \lbrace (x^{(i)}, y^{(i)}) | i = 1, 2, \dots, N  \rbrace$ is</p>
<p>$$
\gamma = \min_{i = 1, \dots, N} \gamma^{(i)}
$$</p>
<p>From these definitions, the functional and geometric margins are related by</p>
<p>$$
\begin{aligned}
&amp;\gamma^{(i)} = \frac {\hat \gamma^{(i)}} {|| w ||} \\ 
&amp;\gamma = \frac {\hat \gamma} {|| w ||}
\end{aligned}
$$</p>
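<p>These definitions are easy to check numerically. A minimal sketch (toy numbers assumed for illustration) showing that rescaling $(w, b)$ scales the functional margin but leaves the geometric margin unchanged:</p>

```python
import numpy as np

def functional_margin(w, b, x, y):
    # y (w . x + b)
    return y * (w @ x + b)

def geometric_margin(w, b, x, y):
    # functional margin divided by ||w||
    return functional_margin(w, b, x, y) / np.linalg.norm(w)

w, b = np.array([3.0, 4.0]), 1.0          # ||w|| = 5
x, y = np.array([1.0, 2.0]), 1.0

fm = functional_margin(w, b, x, y)        # 1 * (3 + 8 + 1) = 12
gm = geometric_margin(w, b, x, y)         # 12 / 5 = 2.4

# Rescale (w, b) by lambda = 10: the functional margin scales, the geometric does not.
fm2 = functional_margin(10 * w, 10 * b, x, y)
gm2 = geometric_margin(10 * w, 10 * b, x, y)
print(fm, gm, fm2, gm2)
```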
<a class="post-dummy-target" id="svm-的优化目标"></a><h4>SVM 的优化目标</h4>
<p>$$
\max_{w, b} \gamma \qquad s.t. \quad \gamma = \min_{i = 1, \dots, N} \gamma^{(i)}
$$</p>
<p>that is,</p>
<p>$$
\max_{w, b} \gamma \qquad s.t. \quad \gamma^{(i)} \ge \gamma
$$</p>
<p>By the relation between the functional and geometric margins, this objective is equivalent to</p>
<p>$$
\max_{w, b} \frac {\hat \gamma} { ||w|| } \qquad s.t. \quad \hat \gamma^{(i)} \ge \hat \gamma
$$</p>
<p>Rescaling $w$ and $b$ to $\lambda w$ and $\lambda b$ leaves the hyperplane unchanged; the functional margin of the training set is scaled by $\lambda$, while the geometric margin does not change. For every candidate hyperplane we may therefore rescale its parameters $(w, b)$ so that $\hat \gamma = 1$ without changing the solution of the optimization problem. The objective then becomes maximizing $1 / ||w||$, which is equivalent to minimizing $\frac 1 2 ||w||^2$, so the problem can be written as</p>
<p>$$
\min_{w, b} \frac 1 2 ||w||^2 \qquad s.t. \quad \hat \gamma^{(i)} \ge 1
$$</p>
<a class="post-dummy-target" id="svm-优化目标是拉格朗日最优化问题"></a><h4>SVM 优化目标是拉格朗日最优化问题</h4>
<p>将上式中限制条件变形后得到如下拉格朗日最优化形式
$$
\min_{w, b} \frac 1 2 ||w||^2 \\ 
s.t. \quad
\begin{cases}	
1 - y^{(i)} (w^T x^{(i)} + b) \le 0  \\ 
0w + 0b = 0
\end{cases}
$$</p>
<p>that is, a Lagrangian optimization problem in the variables $(w, b)$:
$$
\min_{w, b} f(w, b) \\ 
s.t. \quad
\begin{cases}	
c_i(w, b) \le 0  \\ 
h_j(w, b) = 0
\end{cases}
$$</p>
<p>where</p>
<p>$$
\begin{aligned}
f(w, b) &amp;= \frac 1 2 ||w||^2  \\ 
c_i(w, b) &amp;= 1 - \hat \gamma^{(i)} \\ 
&amp;= 1 - y^{(i)} (w^T x^{(i)} + b) \quad i = 1, \dots, N  \\ 
h_j(w, b) &amp;= 0w + 0b = 0
\end{aligned}
$$</p>
<a class="post-dummy-target" id="利用拉格朗日对偶性最优化-svm-目标"></a><h4>利用拉格朗日对偶性最优化 SVM 目标</h4>
<p>拉格朗日函数可写为
$$
L(w, b, \alpha) = f(w, b) + \sum_{i = 1}^N \alpha_i c_i(w, b)  \qquad \alpha_i \ge 0
$$</p>
<p>Consider the function</p>
<p>$$
\theta_P(w, b) = \max_{\alpha:\alpha_i \ge 0} L(w, b, \alpha)
$$</p>
<p>By Lagrangian duality, the following minimization problem
$$
\min_{w, b} \theta_P(w, b) = \min_{w, b} \max_{\alpha:\alpha_i \ge 0} L(w, b, \alpha) = \min_{w, b} \max_{\alpha:\alpha_i \ge 0} (f(w, b) + \sum_{i = 1}^N \alpha_i c_i(w, b))
$$
and the primal optimization problem
$$
\min_{x} f(x) \\ 
s.t. \quad
\begin{cases}	
c_i(x) \le 0  \\ 
h_j(x) = 0
\end{cases}
$$
are equivalent.</p>
<p>This problem is called the minimax problem of the generalized Lagrangian. Denote its optimal value by $p^\star$, attained at the parameters $w^\star$ and $b^\star$.</p>
<p>Its dual problem is the maximin problem of the generalized Lagrangian:
$$
\max_{\alpha:\alpha_i \ge 0} \theta_D(\alpha) = \max_{\alpha:\alpha_i \ge 0} \min_{w, b} L(w, b, \alpha) = \max_{\alpha:\alpha_i \ge 0} \min_{w, b} (f(w, b) + \sum_{i = 1}^N \alpha_i c_i(w, b))
$$</p>
<p>Denote its optimal value by $d^\star$, attained at the parameters $\alpha^\star$.</p>
<p>Since $f(w, b)$ and $c_i(w, b)$ are convex, $h_j(w, b)$ is affine, and there exists a hyperplane $(w, b)$ at which every sample point satisfies $c_i(w, b) &lt; 0$, all conditions of the Lagrangian duality theorems are met.</p>
<p>Therefore the primal and dual problems both have solutions, $p^\star = d^\star = L(w^\star, b^\star, \alpha^\star)$, and their optimal parameters satisfy the <strong>KKT conditions</strong>:
$$
\nabla_w L(w^\star, b^\star, \alpha^\star) = w^\star - \sum_{i = 1}^N \alpha_i^\star y^{(i)} x^{(i)} = 0  \\ 
$$</p>
<p>$$
\nabla_b L(w^\star, b^\star, \alpha^\star) = \sum_{i = 1}^N \alpha_i^\star y^{(i)} = 0
$$</p>
<p>$$
c_i(w^\star, b^\star) = 1 - y^{(i)} (w^{*T} x^{(i)} + b^\star) \le 0  \\ 
$$</p>
<p>$$
\alpha_i^\star c_i(w^\star, b^\star) = \alpha_i^\star (1 - y^{(i)} (w^{*T} x^{(i)} + b^\star)) = 0
$$</p>
<p>$$
\alpha_i^\star \ge 0
$$</p>
<a class="post-dummy-target" id="svm-的分类结果与非支持向量无关"></a><h4>SVM 的分类结果与非支持向量无关</h4>
<p>非支持向量 $(x^{(i)}, y^{(i)})$ 的函数间距 $y^{(i)} (w^{*T} x^{(i)} + b) &gt; 1$，因此 $c_i(w, b) \lt 0$，带入 <strong>KKT 的对偶互补条件</strong> $\alpha_i^\star c_i(x^\star) = 0$ 得 $\alpha_i^\star = 0$，因此 SVM 的分类结果与非支持向量无关。</p>
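<p>A quick numerical check of this fact (toy data and hand-derived multipliers, assumed for illustration): adding a third point that lies strictly outside the margin leaves the solution unchanged, and complementary slackness forces its multiplier to zero.</p>

```python
import numpy as np

# Toy set (illustrative): the third point lies strictly outside the margin.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0]])
y = np.array([1.0, -1.0, 1.0])

# Optimal multipliers for this set, worked out by hand for this sketch.
alpha = np.array([0.25, 0.25, 0.0])
w = (alpha * y) @ X                  # = [0.5, 0.5]
b = 0.0

# c_i(w, b) = 1 - y_i (w . x_i + b): zero for support vectors, negative otherwise.
c = 1.0 - y * (X @ w + b)
print(c)                             # point 3 has c_3 = -1 < 0: not a support vector

# Complementary slackness alpha_i * c_i = 0 holds; c_3 < 0 forces alpha_3 = 0,
# so removing point 3 does not change the decision function at all.
print(np.allclose(alpha * c, 0.0))
```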
<a class="post-dummy-target" id="对偶问题求解"></a><h4>对偶问题求解</h4>
<p>求 $\theta_D(\alpha) = \min_{w, b} L(w, b, \alpha)$</p>
<p>由 KKT 条件得
$$
w^\star = \sum_{i = 1}^N \alpha_i^\star y^{(i)} x^{(i)} \\ 
\sum_{i = 1}^N \alpha_i^\star y^{(i)} = 0
$$</p>
<p>Substituting these back into $L(w, b, \alpha)$ gives</p>
<p>$$
\begin{aligned}
\theta_D(\alpha)
&amp;= \min_{w, b} L(w, b, \alpha) \\ 
&amp;= \frac 1 2 (\sum_{i = 1}^N \alpha_i y^{(i)} x^{(i)}) \cdot (\sum_{j = 1}^N \alpha_j y^{(j)} x^{(j)}) - \sum_{i=1}^N \alpha_i y^{(i)} (\sum_{j = 1}^N \alpha_j y^{(j)} x^{(j)} \cdot x^{(i)} + b ) + \sum_{i=1}^N \alpha_i  \\ 
&amp;=\frac 1 2 \sum_{i = 1}^N \sum_{j = 1}^N \alpha_i \alpha_j y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)}) - \sum_{i=1}^N \sum_{j = 1}^N \alpha_i \alpha_j y^{(i)}  y^{(j)} (x^{(j)} \cdot x^{(i)}) +  \sum_{i=1}^N \alpha_i  \\ 
&amp;= - \frac 1 2 \sum_{i = 1}^N \sum_{j = 1}^N \alpha_i \alpha_j y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)}) +  \sum_{i=1}^N \alpha_i
\end{aligned}
$$</p>
<p>The dual optimization problem can therefore be rewritten as</p>
<p>$$
\max_{\alpha} - \frac 1 2 \sum_{i = 1}^N \sum_{j = 1}^N \alpha_i \alpha_j y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)}) +  \sum_{i=1}^N \alpha_i \\ 
s.t. \quad
\begin{cases}
\sum_{i = 1}^N \alpha_i y^{(i)} = 0  \\ 
\alpha_i \ge 0
\end{cases}
$$</p>
<p>or equivalently</p>
<p>$$
\min_{\alpha} \frac 1 2 \sum_{i = 1}^N \sum_{j = 1}^N \alpha_i \alpha_j y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)}) - \sum_{i=1}^N \alpha_i \\ 
s.t. \quad
\begin{cases}
\sum_{i = 1}^N \alpha_i y^{(i)} = 0  \\ 
\alpha_i \ge 0
\end{cases}
$$</p>
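<p>In matrix form this dual objective is $\frac 1 2 \alpha^T Q \alpha - \mathbf{1}^T \alpha$ with $Q_{ij} = y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)})$. A sketch on random hypothetical data, checking the vectorized form against the explicit double sum in the text:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                  # 5 hypothetical samples in R^3
y = rng.choice([-1.0, 1.0], size=5)
alpha = rng.uniform(size=5)

# Vectorized: Q = (y y^T) * (X X^T) elementwise; objective = 1/2 a^T Q a - 1^T a.
Q = np.outer(y, y) * (X @ X.T)
obj_vec = 0.5 * alpha @ Q @ alpha - alpha.sum()

# The explicit double sum from the text, for comparison.
obj_sum = sum(
    0.5 * alpha[i] * alpha[j] * y[i] * y[j] * (X[i] @ X[j])
    for i in range(5) for j in range(5)
) - alpha.sum()

print(obj_vec, obj_sum)
```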
<a class="post-dummy-target" id="由对偶问题的解求原问题的解"></a><h4>由对偶问题的解求原问题的解</h4>
<p>求得对偶优化问题最优解参数 $\alpha^\star$ 后，接下来就准备求原优化问题最优解参数 $w^\star$ 和 $b^\star$</p>
<p>由 KKT 条件得</p>
<p>$$
w^\star = \sum_{i = 1}^N \alpha^\star_i y^{(i)} x^{(i)}
$$</p>
<p>$$
\alpha_i^\star c_i(w^\star, b^\star) = \alpha_i^\star (1 - y^{(i)} (w^{*T} x^{(i)} + b^\star)) = 0
$$</p>
<p>If $\alpha_i^\star &gt; 0$, the <strong>KKT complementary slackness condition</strong> forces $c_i(w^\star, b^\star) = 0$, i.e. $(x^{(i)}, y^{(i)})$ is a support vector.</p>
<p>Thus for any $i$ with $\alpha_i^\star &gt; 0$, the condition above reduces to
$$
1 - y^{(i)} (w^{\star T} x^{(i)} + b^\star) = 0
$$</p>
<p>Substituting $w^\star$ gives</p>
<p>$$
1 - y^{(i)} ( \sum_{j = 1}^N \alpha_j^\star y^{(j)} (x^{(j)} \cdot x^{(i)}) + b^\star ) = 0
$$</p>
<p>Multiplying both sides by $y^{(i)}$ (using $(y^{(i)})^2 = 1$) gives</p>
<p>$$
y^{(i)} -  ( \sum_{j = 1}^N \alpha_j^\star y^{(j)} (x^{(j)} \cdot x^{(i)}) + b^\star ) = 0
$$</p>
<p>Therefore</p>
<p>$$
b^\star = y^{(i)} -  \sum_{j = 1}^N \alpha_j^\star y^{(j)} (x^{(j)} \cdot x^{(i)}) \qquad s.t. \quad \alpha_i^\star &gt; 0
$$</p>
<p>In summary, for a linearly separable SVM we first solve the dual problem</p>
<p>$$
\min_{\alpha} \frac 1 2 \sum_{i = 1}^N \sum_{j = 1}^N \alpha_i \alpha_j y^{(i)} y^{(j)} (x^{(i)} \cdot x^{(j)}) - \sum_{i=1}^N \alpha_i \\ 
s.t. \quad
\begin{cases}
\sum_{i = 1}^N \alpha_i y^{(i)} = 0  \\ 
\alpha_i \ge 0
\end{cases}
$$</p>
<p>and, having obtained the dual optimal parameters $\alpha^\star$, recover the primal optimal parameters via</p>
<p>$$
w^\star = \sum_{i = 1}^N \alpha^\star_i y^{(i)} x^{(i)}  \\ 
b^\star = y^{(i)} -  \sum_{j = 1}^N \alpha_j^\star y^{(j)} (x^{(j)} \cdot x^{(i)})  \qquad s.t. \quad \alpha_i^\star &gt; 0
$$</p>
<p>By <strong>[Theorem 2]</strong> below, the primal optimal value equals the dual optimal value.</p>
<a class="post-dummy-target" id="线性-svm"></a><h2>线性 SVM</h2>
<a class="post-dummy-target" id="非线性-svm"></a><h2>非线性 SVM</h2>
<a class="post-dummy-target" id="拉格朗日对偶性"></a><h2>拉格朗日对偶性</h2>
<a class="post-dummy-target" id="原问题"></a><h3>原问题</h3>
<p>$$
\min_{x} f(x) \\ 
s.t. \quad
\begin{cases}	
c_i(x) \le 0  \\ 
h_j(x) = 0
\end{cases}
$$</p>
<p>where $c_i(x)$, $i = 1, \dots, N$, and $h_j(x)$, $j = 1, \dots, M$, are the inequality and equality constraint functions of the problem.</p>
<a class="post-dummy-target" id="拉格朗日函数与原问题的关系"></a><h3>拉格朗日函数与原问题的关系</h3>
<p>拉格朗日函数可写为
$$
L(x, \alpha, \beta) = f(x) + \sum_{i = 1}^N \alpha_i c_i(x) + \sum_{j=1}^M \beta_j h_j(x)  \qquad \alpha_i \ge 0
$$</p>
<p>Consider the function</p>
<p>$$
\theta_P(x) = \max_{\alpha, \beta:\alpha_i \ge 0} L(x, \alpha, \beta)
$$</p>
<p>If some constraint is violated at $x$, i.e. $c_i(x) &gt; 0$ for some $i$ or $h_j(x) \ne 0$ for some $j$, then</p>
<p>$$
\theta_P(x) = \max_{\alpha, \beta:\alpha_i \ge 0} (f(x) + \sum_{i = 1}^N \alpha_i c_i(x) + \sum_{j=1}^M \beta_j h_j(x) ) = +\infty
$$</p>
<p>Indeed, letting $\alpha_i \rightarrow +\infty$ (or choosing $\beta_j$ so that $\beta_j h_j(x) \rightarrow +\infty$) while setting the remaining components of $\alpha$ and $\beta$ to zero drives $L(x, \alpha, \beta)$ to $+\infty$.</p>
<p>Conversely, if $x$ satisfies all the constraints, i.e. $c_i(x) \le 0$ for every $i$ and $h_j(x) = 0$ for every $j$, then</p>
<p>$$
\theta_P(x) = \max_{\alpha, \beta:\alpha_i \ge 0} (f(x) + \sum_{i = 1}^N \alpha_i c_i(x) + \sum_{j=1}^M \beta_j h_j(x) ) = f(x)
$$</p>
<p>The sum $\sum_{i = 1}^N \alpha_i c_i(x)$ attains its maximum, $0$, exactly when $\alpha_j = 0$ for every $j$ with $c_j(x) &lt; 0$ (the $\alpha_i$ with $c_i(x) = 0$ may take any nonnegative value), i.e. when $\alpha_i c_i(x) = 0$ for every $i$:
$$
\max \sum_{i = 1}^N \alpha_i c_i(x) = \sum_{ c_j(x) &lt; 0} 0 \cdot c_j(x) + \sum_{c_i(x) = 0} \alpha_i \cdot 0 = 0
$$
while $\sum_{j=1}^M \beta_j h_j(x) = 0$ holds for any value of $\beta$.</p>
<p>$L(x, \alpha, \beta)$ then attains its maximum $f(x)$:
$$
\theta_P(x) = \max_{\alpha, \beta:\alpha_i \ge 0} L(x, \alpha, \beta) = \max_{\alpha, \beta:\alpha_i \ge 0} (f(x) + \sum_{i = 1}^N \alpha_i c_i(x) + \sum_{j=1}^M \beta_j h_j(x) ) \xlongequal{c_i(x) \le 0,\ h_j(x) = 0} f(x)
$$
In summary,
$$
\theta_P(x) = 
\begin{cases}
f(x),    \qquad c_i(x) \le 0 \text{ and } h_j(x) = 0  \\ 
+\infty, \qquad \text{otherwise}
\end{cases}
$$</p>
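<p>This case analysis can be illustrated in one dimension (the functions here are chosen for this sketch, not taken from the post): $f(x) = x^2$ with the single inequality constraint $1 - x \le 0$ and no equality constraints. A large cap on $\alpha$ stands in for the supremum:</p>

```python
def f(x):
    return x ** 2

def c(x):
    return 1.0 - x                  # constraint c(x) <= 0, i.e. x >= 1

def theta_P(x, alpha_cap=1e6):
    # sup over alpha >= 0 of f(x) + alpha * c(x); the objective is linear
    # in alpha, so the maximum over [0, alpha_cap] is attained at an endpoint.
    return max(f(x) + a * c(x) for a in (0.0, alpha_cap))

print(theta_P(2.0))   # feasible: c(2) = -1 <= 0, so theta_P(2) = f(2) = 4
print(theta_P(0.5))   # infeasible: c(0.5) > 0, so theta_P blows up with the cap
```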
<p>Hence the following minimization problem
$$
\min_{x} \theta_P(x) = \min_{x} \max_{\alpha, \beta:\alpha_i \ge 0} L(x, \alpha, \beta) = \min_{x} \max_{\alpha, \beta:\alpha_i \ge 0} (f(x) + \sum_{i = 1}^N \alpha_i c_i(x) + \sum_{j=1}^M \beta_j h_j(x) )
$$
and the primal optimization problem
$$
\min_{x} f(x) \\ 
s.t. \quad
\begin{cases}	
c_i(x) \le 0  \\ 
h_j(x) = 0
\end{cases}
$$
are equivalent.</p>
<p>This optimization problem is called the minimax problem of the generalized Lagrangian. Denote its optimal value by $p^\star$, attained at the parameter $x^\star$.</p>
<a class="post-dummy-target" id="对偶问题及相关定理"></a><h4>对偶问题及相关定理</h4>
<p>定义对偶问题为广义拉格朗日函数的极大极小问题如下
$$
\max_{\alpha, \beta:\alpha_i \ge 0} \theta_D(\alpha, \beta) = \max_{\alpha, \beta:\alpha_i \ge 0} \min_{x} L(x, \alpha, \beta) = \max_{\alpha, \beta:\alpha_i \ge 0} \min_{x} (f(x) + \sum_{i = 1}^N \alpha_i c_i(x) + \sum_{j=1}^M \beta_j h_j(x) )
$$</p>
<p>Denote its optimal value by $d^\star$, attained at the parameters $(\alpha^\star, \beta^\star)$.</p>
<p>The following theorems hold:</p>
<p><strong>[Theorem 1]</strong> (weak duality) If both the primal and the dual problem have optimal values, then</p>
<p>$$
d^\star = \max_{\alpha, \beta:\alpha_i \ge 0} \min_{x} L(x, \alpha, \beta) \le \min_{x} \max_{\alpha, \beta:\alpha_i \ge 0} L(x, \alpha, \beta) = p^\star
$$</p>
<p>Proof:</p>
<p>Since</p>
<p>$$
\theta_D(\alpha, \beta) = \min_{x} L(x, \alpha, \beta) \le L(x, \alpha, \beta) \le \max_{\alpha, \beta:\alpha_i \ge 0} L(x, \alpha, \beta) = \theta_P(x)
$$</p>
<p>it follows that</p>
<p>$$
\max_{\alpha, \beta: \alpha_i \ge 0} \theta_D(\alpha, \beta) \le \min_{x} \theta_P(x)
$$</p>
<p>that is,</p>
<p>$$
d^\star \le p^\star
$$</p>
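<p>A one-dimensional toy problem (chosen for this sketch) makes the inequality concrete: minimize $f(x) = x^2$ subject to $1 - x \le 0$. Here $f$ and $c$ are convex, so by the theorems below the bound holds with equality:</p>

```python
import numpy as np

# Toy problem: minimize f(x) = x^2 subject to 1 - x <= 0 (i.e. x >= 1).
# Primal optimum p* is attained at the boundary x = 1.
xs = np.linspace(1.0, 5.0, 100001)           # sample of the feasible region
p_star = (xs ** 2).min()

# Dual function: theta_D(alpha) = min_x [x^2 + alpha (1 - x)] = alpha - alpha^2 / 4,
# since the inner minimum is at x = alpha / 2. Maximize over alpha >= 0.
alphas = np.linspace(0.0, 10.0, 100001)
d_star = (alphas - alphas ** 2 / 4.0).max()

print(p_star, d_star)    # d* <= p*, here with equality (convex problem)
```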
<p><strong>[Corollary]</strong> If feasible solutions of the primal and dual problems attain equal objective values $p^\star = d^\star$, then those solutions are optimal.</p>
<p><strong>[Theorem 2]</strong> If $f(x)$ and $c_i(x)$ are convex and $h_j(x)$ is affine (conditions that always hold for the SVM optimization problem), and there exists an $x$ with $c_i(x) &lt; 0$ for all $i$ (Slater's condition), then the primal and dual problems both have solutions, and</p>
<p>$$
p^\star = d^\star = L(x^\star, \alpha^\star, \beta^\star)
$$</p>
<p><strong>[Theorem 3]</strong> Under the same assumptions as Theorem 2, $x^\star$ and $(\alpha^\star, \beta^\star)$ are solutions of the primal and dual problems respectively if and only if they satisfy the <strong>KKT conditions</strong> (Karush-Kuhn-Tucker conditions):</p>
<p>$$
\nabla_x L(x^\star, \alpha^\star, \beta^\star) =  0  \\ 
$$</p>
<p>$$
c_i(x^\star) \le 0  \\ 
$$</p>
<p>$$
h_j(x^\star) = 0
$$</p>
<p>$$
\alpha_i^\star c_i(x^\star) = 0  \\ 
$$</p>
<p>$$
\alpha_i^\star \ge 0
$$</p>
<p>In particular, $\alpha_i^\star c_i(x^\star) = 0$ is called the <strong>KKT complementary slackness condition</strong>:</p>
<ul>
<li>at points with $c_i(x^\star) = 0$ (active constraints), $\alpha_i^\star &gt; 0$ is allowed</li>
<li>at points with $c_i(x^\star) &lt; 0$, necessarily $\alpha_i^\star = 0$</li>
</ul></div><div class="post-footer" id="post-footer">
    <div class="post-info">
        <div class="post-info-line">
            <div class="post-info-mod">
                <span>本文于 2020-11-20 更新</span>
            </div>
            <div class="post-info-license"></div>
        </div>
        <div class="post-info-line">
            <div class="post-info-md"></div>
            <div class="post-info-share"><span><a href="//www.linkedin.com/shareArticle?url=https%3a%2f%2fdiraclee.gitee.io%2f2020%2f11%2fsvm%2f&amp;title=SVM%20%e5%85%ac%e5%bc%8f%e6%8e%a8%e5%af%bc" target="_blank" title="分享到 LinkedIn">
            <i class="fab fa-linkedin fa-fw"></i>
        </a><a href="//service.weibo.com/share/share.php?url=https%3a%2f%2fdiraclee.gitee.io%2f2020%2f11%2fsvm%2f&amp;appkey=&amp;title=SVM%20%e5%85%ac%e5%bc%8f%e6%8e%a8%e5%af%bc" target="_blank" title="分享到 Weibo">
            <i class="fab fa-weibo fa-fw"></i>
        </a></span></div>
        </div>
    </div>

    <div class="post-info-more">
        <section><span class="tag">
                        <a href="https://diraclee.gitee.io/tags/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/"><i class="fas fa-tag fa-fw"></i>&nbsp;机器学习</a>&nbsp;
                    </span><span class="tag">
                        <a href="https://diraclee.gitee.io/tags/svm/"><i class="fas fa-tag fa-fw"></i>&nbsp;SVM</a>&nbsp;
                    </span></section>
        <section>
            <span><a href="javascript:window.history.back();">返回</a></span>&nbsp;|&nbsp;<span><a href="https://diraclee.gitee.io">主页</a></span>
        </section>
    </div>

    <div class="post-nav"><a href="https://diraclee.gitee.io/2020/10/sklearn-%E5%B8%B8%E7%94%A8%E6%A8%A1%E5%9E%8B%E5%8F%82%E6%95%B0/" class="prev" rel="prev" title="机器学习常用模型及其 sklearn 包"><i class="fas fa-angle-left fa-fw"></i>机器学习常用模型及其 sklearn 包</a>
            <a href="https://diraclee.gitee.io/2020/12/%E6%8E%A8%E8%8D%90%E7%B3%BB%E7%BB%9F%E7%BB%BC%E8%BF%B0/" class="next" rel="next" title="推荐系统综述">推荐系统综述<i class="fas fa-angle-right fa-fw"></i></a></div>
</div><div class="post-comment"></div>
    </article></div>
            </main><footer class="footer">
    <div class="copyright"><div class="copyright-line">由 <a href="https://gohugo.io/" target="_blank" rel="external nofollow noopener noreffer">Hugo</a> 强力驱动 | 主题 - <a href="https://github.com/dillonzq/LoveIt" target="_blank" rel="external nofollow noopener noreffer">LoveIt<i class="far fa-heart fa-fw"></i></a>
        </div>

        <div class="copyright-line"><i class="far fa-copyright fa-fw"></i><span itemprop="copyrightYear">2020</span><span class="author" itemprop="copyrightHolder">&nbsp;<a href="https://diraclee.gitee.io" target="_blank">Dirac Lee</a></span>&nbsp;|&nbsp;<span class="license"><a rel="license external nofollow noopener noreffer" href="https://creativecommons.org/licenses/by-nc/4.0/" target="_blank">CC BY-NC 4.0</a></span></div>
    </div>
</footer></div><a href="#" class="dynamic-to-top" id="dynamic-to-top" data-scroll>
            <span>&nbsp;</span>
        </a><script src="/js/lib/jquery/jquery.slim.min.js"></script><script src="/js/lib/lazysizes/lazysizes.min.js"></script><script src="/js/lib/smooth-scroll/smooth-scroll.polyfills.min.js"></script><script>window.scroll = new SmoothScroll('[data-scroll]', {speed: 300, speedAsDuration: true});</script><link rel="stylesheet" href="/css/lib/katex/katex.min.css"><script src="/js/lib/katex/katex.min.js"></script><script defer src="/js/lib/katex/auto-render.min.js"></script><link rel="stylesheet" href="/css/lib/katex/copy-tex.min.css"><script defer src="/js/lib/katex/copy-tex.min.js"></script><script defer src="/js/lib/katex/mhchem.min.js"></script><script>
        document.addEventListener("DOMContentLoaded", function () {
            renderMathInElement(document.body, {
                delimiters: [
                    { left: "$$", right: "$$", display: true },
                    { left: "\\(", right: "\\)", display: false },
                    { left: "\\[", right: "\\]", display: true },{ left: "$", right: "$", display: false },]
            });
        });
    </script><script src="/js/blog.min.js"></script></body>
</html>